Received: from cheltenham.cs.arizona.edu by optima.cs.arizona.edu (5.65c/15) via SMTP id AA29083; Thu, 18 Mar 1993 15:00:49 MST
Date: Thu, 18 Mar 1993 15:00:48 MST
From: "Nick Kline" <kline>
Message-Id: <199303182200.AA10374@cheltenham.cs.arizona.edu>
Received: by cheltenham.cs.arizona.edu; Thu, 18 Mar 1993 15:00:48 MST
To: tsql
Subject: More proposed definitions

Here are more proposed glossary entries. These definitions primarily
concern aggregates in temporal query languages.

-Nick Kline
kline@cs.arizona.edu

\subsection{Partitioning Attribute}

The {\em partitioning attribute} is the attribute used to partition a
relation into sets and is used in aggregation. All members of a set
have the same value for the partitioning attribute. The sets are
distinguished by different partitioning attribute values.

\entry{Alternative Names}

Grouping attribute.

\entry{Discussion}

Grouping is the accepted term, but it does not denote that the
subdivision is into disjoint sets, whereas partitioning does imply
this (-E3, +E9).

The partitioning attribute may be a single attribute or a combination
of several attributes. In the latter case, the relation is partitioned
on the combination of the attribute values, where each unique
combination of attribute values distinguishes a set.

The partitioning attribute is used only in value partitioning.

\subsection{Value Partitioning}

{\em Value partitioning} is the partitioning of a relation based on
the value of the partitioning attribute or attributes, and is used in
aggregation. All tuples within a set have the same partitioning
attribute value.

\entry{Alternative Names}

Value grouping.

\entry{Discussion}

Value grouping is awkward and does not adequately denote that the
relation is subdivided into disjoint subsets, that is, subsets no two
of which contain a common element.
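The following is a minimal Python sketch of value partitioning on a composite partitioning attribute, followed by an aggregate over each set. The relation, the attribute names (`dept`, `rank`, `salary`), and the function name `value_partition` are all hypothetical and chosen only for illustration; they do not come from the glossary.

```python
from collections import defaultdict


def value_partition(tuples, partitioning_attributes):
    """Split tuples (dicts) into disjoint sets, one per unique
    combination of partitioning attribute values."""
    partitions = defaultdict(list)
    for t in tuples:
        # The key is the combination of values of all partitioning attributes.
        key = tuple(t[a] for a in partitioning_attributes)
        partitions[key].append(t)
    return dict(partitions)


# Hypothetical relation, for illustration only.
employees = [
    {"name": "Ann", "dept": "Toys", "rank": 1, "salary": 30},
    {"name": "Bob", "dept": "Toys", "rank": 2, "salary": 40},
    {"name": "Cal", "dept": "Shoes", "rank": 1, "salary": 35},
    {"name": "Dee", "dept": "Toys", "rank": 1, "salary": 50},
]

by_dept_rank = value_partition(employees, ["dept", "rank"])

# Aggregate (average salary) computed separately over each set.
avg_salary = {
    key: sum(t["salary"] for t in members) / len(members)
    for key, members in by_dept_rank.items()
}
```

Because every tuple is assigned to exactly one key, the resulting sets are disjoint and together cover the relation, which is the property the term partitioning is meant to convey.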
\subsection{Valid-time Grouping}

{\em Valid-time grouping} is the grouping of the valid time-line into
{\em valid-time elements}, on each of which a cumulative aggregate may
then be applied. The valid-time elements may overlap and do not
necessarily cover the time-line.

To compute the aggregate, first determine the valid-time elements of
the grouping, then assemble the tuples valid over each valid-time
element into a set, and finally compute the aggregate over each of
these sets.

\entry{Alternative Names}

Valid-time partitioning.

\entry{Discussion}

Grouping the time-line is a useful capability for aggregates in
temporal databases (+R1,+R3).

Partitioning is inappropriate because the valid-time elements may
overlap and may not cover the time-line, so they do not necessarily
form a {\em partition}.

One example of valid-time grouping is to divide the time-line into
years, based on the Gregorian calendar. Then, for each year, compute
the count of the tuples that overlap that year.

There is no existing term for this concept. There is no grouping
attribute in valid-time grouping, since the grouping does not depend
on attribute values, but instead on valid times.

Valid-time grouping may occur before or after value partitioning.

\subsection{Dynamic Valid-time Grouping}

In {\em dynamic valid-time grouping} the valid-time elements used in
the grouping are determined solely from the timestamps of the relation
being grouped.

\entry{Alternative Names}

Moving window.

\entry{Discussion}

The term dynamic is appropriate (as opposed to static) because if the
information in the database changes, the grouping intervals may
change. The intervals are determined from intrinsic information.

One example of dynamic valid-time grouping would be to compute the
average value of an attribute in the relation (say the salary) over
the year preceding the stop-time of each tuple.
One technique for computing this query is, for each tuple, to find all
tuples valid in the year preceding the stop-time of the tuple in
question and combine these tuples into a set, and then to compute the
average of the salary attribute values in each set.

It may seem inappropriate to use valid-time elements instead of
intervals; however, there is no reason to exclude valid-time elements,
as the time-line grouping may overlap in either case.

The existing term for this concept does not have an opposing term
suitable to refer to dynamic valid-time grouping, and may not
distinguish between the two types of valid-time grouping (-E3, +E9).

Various temporal query languages have used both dynamic and static
valid-time grouping, but have not always been clear about which type
of grouping they support (+E1). Using these terms will remove this
ambiguity from future discussions.

\subsection{Static Valid-time Grouping}

\entry{Definition}

In {\em static valid-time grouping} the valid-time elements used are
determined solely from fixed points on a calendar, such as the start
of each year. The valid-time elements cover the valid time-line.

\entry{Alternative Names}

Moving window.

\entry{Discussion}

This term further distinguishes existing terms (-E3, +E9). It is an
obvious parallel to dynamic valid-time grouping (+E1).

Static is an appropriate term because the grouping intervals are
determined from extrinsic information. The grouping intervals would
not change if the information in the database changed.

Computing the maximum salary of employees during each month is an
example that requires static valid-time grouping. To compute this
information, first divide the time-line into valid-time elements where
each element represents a separate month on, say, the Gregorian
calendar. Then find the tuples valid over each valid-time element, and
compute the maximum aggregate over the members of each set.
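The contrast between the two kinds of valid-time grouping can be sketched in Python. This is only an illustration under stated assumptions: the relation `EMPLOYEES` and the function names are hypothetical, timestamps have day granularity, each tuple is valid over the half-open interval from start up to (but excluding) stop, and "the previous year" is approximated as a 365-day window.

```python
from datetime import date, timedelta

# Hypothetical relation: (name, salary, start, stop), valid over [start, stop).
EMPLOYEES = [
    ("Ann", 30, date(1992, 1, 1), date(1992, 3, 1)),
    ("Bob", 40, date(1992, 2, 1), date(1992, 4, 1)),
    ("Cal", 50, date(1993, 6, 1), date(1993, 7, 1)),
]


def overlaps(start, stop, lo, hi):
    """True if [start, stop) and [lo, hi) share at least one day."""
    return start < hi and lo < stop


def static_monthly_max(relation, year):
    """Static grouping: the elements are the calendar months of `year`,
    fixed by the calendar and independent of the data."""
    result = {}
    for month in range(1, 13):
        lo = date(year, month, 1)
        hi = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        salaries = [sal for _, sal, s, e in relation if overlaps(s, e, lo, hi)]
        if salaries:  # months with no valid tuples get no aggregate value
            result[month] = max(salaries)
    return result


def dynamic_trailing_average(relation, window=timedelta(days=365)):
    """Dynamic grouping: one element per tuple, namely the window that
    ends at that tuple's stop time, so the elements depend on the data.
    Results are keyed by name, assumed unique in this toy relation."""
    result = {}
    for name, _, _, stop in relation:
        lo, hi = stop - window, stop
        salaries = [sal for _, sal, s, e in relation if overlaps(s, e, lo, hi)]
        result[name] = sum(salaries) / len(salaries)
    return result


monthly_max_1992 = static_monthly_max(EMPLOYEES, 1992)
trailing_avg = dynamic_trailing_average(EMPLOYEES)
```

Adding or changing a tuple alters the windows used by `dynamic_trailing_average`, whereas the month boundaries used by `static_monthly_max` stay the same; only the tuples falling into each month change.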
\subsection{Valid-time Cumulative Aggregation}

\entry{Definition}

In {\em cumulative aggregation}, for each valid-time element of the
valid-time grouping (produced by either dynamic or static valid-time
grouping), the aggregate is applied to all tuples associated with that
valid-time element.

The value of the aggregate at any event is the value computed over the
grouping element that contains that event.

\entry{Alternative Names}

Moving window.

\entry{Discussion}

{\em Cumulative} is used because the interesting values are defined
over a cumulative range of time (+E8). This term is more precise than
the existing term (-E3, +E9).

Cumulative aggregation may be further restricted by valid-time
grouping (cf. static and dynamic valid-time grouping). Instantaneous
aggregation may be considered to be a degenerate case of cumulative
aggregation.

One example of cumulative aggregation would be to find the total
number of employees who had worked at some point for a company. To
compute this value at the end of each calendar year, define, for each
year, a valid-time element that extends from the beginning of time up
to the end of that year. For each valid-time element, find all tuples
that overlap that element, and finally count the number of tuples in
each set.

\subsection{Instantaneous Aggregation}

\entry{Definition}

In {\em instantaneous aggregation}, for each event on the valid
time-line, the aggregate is applied to all tuples valid at that event.

\entry{Alternative Names}

None.

\entry{Discussion}

The term {\em instantaneous} is appropriate because the aggregate is
applied over an event. It suggests an interest in the aggregate value
over a very small time interval, an instant, much as acceleration is
defined in physics over an infinitesimally small time (+R3).

Many temporal query languages perform instantaneous aggregation,
others use cumulative aggregation, while still others use a
combination of the two.
This term will be useful to distinguish between the various
alternatives, and is already used by some researchers (+R4,+E3).

\subsection{Gregorian Calendar}

\entry{Definition}

The {\em Gregorian calendar} is composed of 12 months, named in order,
January, February, March, April, May, June, July, August, September,
October, November, and December. The 12 months form a year. A year is
either 365 or 366 days in length, where the extra day occurs in ``leap
years'', defined as years evenly divisible by 4, except for centesimal
years not divisible by 400. Each month has a fixed number of days,
except for February, the length of which varies by a day depending on
whether or not the particular year is a leap year.

\entry{Alternative Names}

None.

\entry{Discussion}

The Gregorian calendar is widely used and accepted (+E3,+E7). This
term is defined and used elsewhere (-R1), but is in such common use in
temporal databases that it should be defined.
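As an illustration, the standard Gregorian leap-year rule (a year divisible by 4 is a leap year, unless it is a centesimal year not divisible by 400) and the resulting month and year lengths can be written as a short Python sketch. The function names are hypothetical and chosen only for this example.

```python
def is_leap_year(year):
    """Standard Gregorian rule: divisible by 4, except centesimal years
    (divisible by 100) that are not also divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_month(year, month):
    """Length of a month (1 = January ... 12 = December); only February varies."""
    lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    if month == 2 and is_leap_year(year):
        return 29
    return lengths[month - 1]


def days_in_year(year):
    """Sum of the twelve month lengths: 365, or 366 in a leap year."""
    return sum(days_in_month(year, m) for m in range(1, 13))
```

Under this rule 1992 and 2000 are leap years, while 1900 and 1993 are not.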